ABSTRACT

Aims. We identify and study a previously unknown systematic effect on cosmic shear measurements, caused by the selection of 
galaxies used for shape measurement, in particular the rejection of close (blended) galaxy pairs. 

Methods. We use ray-tracing simulations based on the Millennium Simulation and a semi-analytical model of galaxy formation to 
create realistic galaxy catalogues. From these, we quantify the bias in the shear correlation functions by comparing measurements 
made from galaxy catalogues with and without removal of close pairs. A likelihood analysis is used to quantify the resulting shift in 
estimates of cosmological parameters. 

Results. The filtering of objects with close neighbours (a) changes the redshift distribution of the galaxies used for correlation function 
measurements, and (b) correlates the number density of sources in the background with the density field in the foreground. This leads 
to a scale-dependent bias of the correlation function of several percent, translating into biases of cosmological parameters of similar 
amplitude. This makes this new systematic effect potentially harmful for upcoming and planned cosmic shear surveys. As a remedy, 
we propose and test a weighting scheme that can significantly reduce the bias. 



1. Introduction 

In preparation for upcoming and planned large cosmic shear
surveys, such as Pan-STARRS (Kaiser & Pan-STARRS Collaboration
2005), KIDS or Euclid (Refregier et al. 2010), it is vital to find
and quantify possible sources of systematic effects that can hamper
the full exploitation of the information contained in these large
data sets. A number of such effects have already been identified.
The most fundamental problem on the observational side is to obtain
unbiased estimates of the shapes of galaxies. The difficulty of this
has been demonstrated, for example, by the STEP programme (Heymans
et al. 2006; Massey et al. 2007) and the GREAT08 challenge (Bridle
et al. 2010), where several shape measurement methods have been
tested on mock data. Further, reliable photometric redshifts are
crucial for obtaining an accurate redshift distribution of the
galaxy sample under consideration, which is needed for accurate
theoretical predictions, and also for constructing redshift bins for
shear tomography (Hearin et al. 2010; Bernstein & Huterer 2010; Ma
et al. 2006). Intrinsic alignments of physically close galaxies and
shape-shear alignments probably constitute the most severe physical
contaminant of the cosmic shear signal. The first can be reduced by
removing physically close pairs (King & Schneider 2002, 2003;
Heymans & Heavens 2003; Takada & White 2004), whereas the influence
of the latter can be removed either by the so-called nulling
technique (Joachimi & Schneider 2008, 2009) or by self-calibration
(Zhang 2010; Joachimi & Bridle 2010). Another physical
contamination is caused by the magnification effect. Density
fluctuations in the foreground can, depending on the slope of the
galaxy number counts, either enhance or deplete the number of
background galaxies, thus correlating the density field in the
foreground with the galaxy distribution that is used to estimate its
shear field (Schmidt et al. 2009).

In order to achieve percent-level accuracy, the difference between
shear and reduced shear also needs to be taken into account.
Theoretical predictions for the shear correlation functions can be
obtained from the matter power spectrum with relative ease (e.g.
Bartelmann & Schneider 2001), but for the computation of the
actually observable reduced shear correlation functions it is
necessary to include higher-order corrections to the shear power
spectrum (White 2005; Krause & Hirata 2010).

Finally, the process of parameter estimation requires great care as
well. For example, the likelihood of the shear correlation functions
has been shown to be significantly non-Gaussian (Hartlap et al.
2009; Schneider & Hartlap 2009). This may also apply to other
two-point statistics derived from the correlation functions.
Furthermore, even if a Gaussian likelihood is assumed, the cosmology
dependence of the covariance matrix of the statistics under
consideration should be taken into account, as was shown in Eifler
et al. (2009). Neglecting these issues could introduce
non-negligible biases to estimates of cosmological parameters.

In this paper, we add to this list a systematic effect that leads 
to a biased estimate of the shear correlation functions. This bias 
is due to the fact that the ellipticity of a galaxy cannot be es- 
timated reliably when its light distribution overlaps with that 
of a close neighbour. Therefore, it is common to discard pairs 
of galaxies that appear too close together on the sky. We argue 
that, while allowing for clean estimates of galaxy shapes, this
practice has two adverse side effects: it changes the redshift
distribution of source galaxies, and it correlates the lensing mass
distribution in the foreground with the source galaxy distribu- 
tion in the background. Because of the latter issue, the product 
of the complex ellipticities of a randomly selected pair of galax- 
ies no longer yields an unbiased estimate of the shear correlation 
function. 

The article is organized as follows: after briefly reviewing the
cosmic shear two-point statistics relevant for this paper in
Sect. 2, we describe in Sect. 3 the ray-tracing simulations and
semi-analytic galaxy formation models we use to create our mock
galaxy catalogues. In Sect. 4 we quantify the bias in the shear
correlation function using our simulation results for various
choices of galaxy selection criteria. We then propose a weighting
scheme that can help to reduce the bias (Sect. 5) and discuss the
impact on cosmological parameter estimation (Sect. 6). We conclude
the paper in Sect. 7.

2. The shear correlation functions 

Several statistics have been developed to capture the two-point
information that is contained in the ellipticities of distant
galaxies, such as the shear correlation functions (e.g. Kaiser 1992;
Crittenden et al. 2002), the shear dispersion in circular apertures
(e.g. Kaiser 1992) or the aperture mass dispersion (e.g. Schneider
et al. 1998, 2002a). Recently, Schneider et al. (2010) have proposed
the so-called COSEBIs, which allow for a clean E-/B-mode
decomposition given the shear correlation functions on a finite
interval. These statistics are all related to the power spectrum of
the weak lensing convergence (see, e.g., Crittenden et al. 2002;
Schneider et al. 2002b). Regarding actual measurements, the shear
correlation functions $\xi_\pm$ are the most convenient of these
statistics, since they can be estimated with relative ease from real
data sets, even in the presence of gaps and masked regions. Any
other two-point statistic of interest is therefore usually computed
from an estimate of the shear correlation functions.

A practical estimator for the shear correlation functions is given
by (e.g. Schneider et al. 2002a)

$$\hat{\xi}_\pm(\theta) = \frac{\sum_{ij} p_i\, p_j \left(\epsilon_{{\rm t},ij}\,\epsilon_{{\rm t},ji} \pm \epsilon_{\times,ij}\,\epsilon_{\times,ji}\right) \Delta_{ij}(\theta)}{\sum_{ij} p_i\, p_j\, \Delta_{ij}(\theta)}, \qquad (1)$$

where the sum runs over all pairs of galaxies, located at the
angular positions $\theta_i$. The complex ellipticity of the $i$-th
galaxy is denoted by $\epsilon_i$, and its tangential and cross
components with respect to the line joining it to the $j$-th galaxy
are given by $\epsilon_{{\rm t},ij}$ and $\epsilon_{\times,ij}$,
respectively. The symbol $\Delta_{ij}(\theta)$ is equal to one if the
angular separation of the $i$-th and $j$-th galaxies lies in the bin
centred on $\theta$, and vanishes otherwise. Finally, the $p_i$ are
weights assigned to the galaxies. For the purpose of this work, it is
convenient to write them as $p_i = \varpi_i s_i$, where $s_i$ is a
statistical weight that, for example, reflects the quality of the
shape estimate. The "selection weight" $\varpi_i$ is zero if the
galaxy is too close to its nearest neighbour to allow for a reliable
measurement of its shape, and unity otherwise.
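
The estimator of Eq. (1) can be illustrated with a brute-force sketch. This is only a minimal illustration, not the code used for the measurements in this paper: it assumes a flat sky, loops over all pairs, and uses the sign convention $\epsilon_{\rm t} + {\rm i}\epsilon_\times = -\epsilon\,{\rm e}^{-2{\rm i}\varphi}$ with $\varphi$ the polar angle of the separation vector. The function name `shear_correlation` and its argument layout are hypothetical.

```python
import cmath
import math


def shear_correlation(positions, ellipticities, weights, bin_edges):
    """Brute-force estimate of xi_+ and xi_- in bins of angular separation.

    positions     -- list of (x, y) flat-sky coordinates
    ellipticities -- list of complex ellipticities (e1 + 1j * e2)
    weights       -- list of per-galaxy weights p_i (0 removes a galaxy)
    bin_edges     -- increasing bin boundaries, same units as positions
    Returns two lists (xi_plus, xi_minus); empty bins are set to None.
    """
    nbin = len(bin_edges) - 1
    num_plus = [0.0] * nbin
    num_minus = [0.0] * nbin
    norm = [0.0] * nbin
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            sep = math.hypot(dx, dy)
            # Delta_ij(theta): find the bin that contains this separation.
            k = next((b for b in range(nbin)
                      if bin_edges[b] <= sep < bin_edges[b + 1]), None)
            if k is None:
                continue
            # Rotate both ellipticities into the frame of the connecting line.
            phase = cmath.exp(-2j * math.atan2(dy, dx))
            rot_i = -ellipticities[i] * phase
            rot_j = -ellipticities[j] * phase
            tt = rot_i.real * rot_j.real  # tangential x tangential
            xx = rot_i.imag * rot_j.imag  # cross x cross
            w = weights[i] * weights[j]
            num_plus[k] += w * (tt + xx)
            num_minus[k] += w * (tt - xx)
            norm[k] += w
    xi_plus = [n / d if d > 0 else None for n, d in zip(num_plus, norm)]
    xi_minus = [n / d if d > 0 else None for n, d in zip(num_minus, norm)]
    return xi_plus, xi_minus
```

Setting a weight to zero reproduces the action of the selection weight: the galaxy drops out of both the numerator and the normalisation. Each pair is visited once here, which leaves the ratio in Eq. (1) unchanged.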

3. Ray-tracing simulations 

Our ray-tracing simulations are based on the dark matter
distribution in the Millennium Simulation (MS, Springel et al. 2005).
The cosmological parameters used for the MS ($\Omega_{\rm m} = 0.25$,
$\Omega_{\rm DE} = 0.75$, $\Omega_{\rm b} = 0.045$, $\sigma_8 = 0.9$,
$h = 0.73$, $w = -1.0$, $n_{\rm s} = 1.0$) also define the fiducial
cosmological model used throughout the paper.



We have used the ray-tracing code described in Hilbert et al. (2009)
to obtain 32 realisations of a $4 \times 4\,{\rm deg}^2$ field, thus
covering $512\,{\rm deg}^2$ in total. The matter distribution along
the backwards light cone of the observer is obtained by the periodic
continuation of simulation snapshots of increasing redshift. It is
then divided into slices of a thickness of
$\approx 100\,h^{-1}\,{\rm Mpc}$, which are
subsequently projected onto lens planes. The periodic repetition 
of structures along the line of sight (l.o.s.) is prevented by choos- 
ing a l.o.s. direction that is tilted with respect to the boundaries 
of the simulation box. The advantage of this technique in com- 
parison to the random transformation approach is that the matter 
distribution is continuous across slice boundaries and that large- 
scale correlations extending beyond the redshift slices are main- 
tained. The code follows a set of light rays, which form a grid on 
the first lens plane (the image plane), through the array of lens 
planes. At the same time, the Jacobian matrices of the lens map- 
ping from the observer to the lens planes are computed using a 
recursion formula. 

To create realistic, lensed mock galaxy catalogues, we combine the
ray-tracing with the semi-analytic model of galaxy formation by
De Lucia & Blaizot (2007), making extensive use of the public
Millennium Simulation database (Lemson & Springel 2006; Lemson & the
Virgo Consortium 2006). We use the method outlined in Hilbert et al.
(2009) to obtain the lensed positions and observed magnitudes
(taking the magnification due to lensing into account) for all
galaxies in the semi-analytic model with
$M_{\rm stellar} \geq 10^9\,h^{-1}\,M_\odot$. In addition, the galaxy
formation model yields the masses of the disc and spheroidal
(henceforth bulge) component and the disc radius (which can be
zero). As described in more detail in Hilbert et al. (2008), we
complement this with an estimate of the comoving radius of the
spheroidal component of the galaxy given by



$$r_{\rm bulge} = 0.54\,(1+z)^{0.55} \left(\frac{M_{\rm bulge}}{10^{10}\,h^{-1}\,M_\odot}\right)^{0.56} h^{-1}\,{\rm kpc}, \qquad (2)$$



which combines the size distribution of galaxies measured by
Shen et al. (2003) and the redshift evolution of galaxy sizes found
by Trujillo et al. (2006). Each galaxy is then assigned an effective
radius $r_{\rm e} = \max(r_{\rm disc}, r_{\rm bulge})$. The angular
radius of the galaxy is given by
$\theta_{\rm e} = \sqrt{\mu}\, r_{\rm e} / f_K(w)$, where $f_K(w)$ is
the comoving angular diameter distance to the galaxy, and $\mu$ is
the lensing magnification at the position of the galaxy. The
resulting distributions of angular and comoving galaxy radii for a
simulated galaxy survey with a magnitude cut of $r_{\rm SDSS} = 25$
are shown in Fig. 1.
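
The size assignment can be sketched as follows. This is a hedged illustration only: the exponents are read from Eq. (2) as $+0.55$ and $+0.56$, the comoving angular diameter distance `f_k` has to be supplied by the caller for the cosmology in use, and the function names are hypothetical.

```python
import math


def bulge_radius_comoving(m_bulge, z):
    """Comoving bulge radius in h^-1 kpc, following Eq. (2).

    m_bulge is the bulge mass in h^-1 M_sun, z the galaxy redshift.
    """
    return 0.54 * (1.0 + z) ** 0.55 * (m_bulge / 1e10) ** 0.56


def angular_radius_arcsec(r_disc, m_bulge, z, f_k, mu=1.0):
    """Angular radius theta_e = sqrt(mu) * r_e / f_K(w) in arcseconds.

    r_disc -- comoving disc radius in h^-1 kpc (may be zero)
    f_k    -- comoving angular diameter distance in h^-1 Mpc (assumed given)
    mu     -- lensing magnification at the galaxy position
    """
    r_e = max(r_disc, bulge_radius_comoving(m_bulge, z))  # effective radius
    theta_rad = math.sqrt(mu) * r_e / (f_k * 1000.0)      # kpc / kpc
    return math.degrees(theta_rad) * 3600.0
```

Because both radii are comoving, they are divided by the comoving angular diameter distance; a magnification $\mu > 1$ enlarges the apparent radius by $\sqrt{\mu}$.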

We construct our catalogues for cosmic shear measurements by
selecting galaxies brighter than three different cuts in the SDSS
$r$-band ($r_{\rm SDSS} = 24, 25, 26$). Unless otherwise stated, we
assume that these cuts are the same as the limiting magnitude of the
survey, $r^{\rm lim}_{\rm SDSS}$. For comparison, however, we will
also consider the case where
$r_{\rm SDSS} < r^{\rm lim}_{\rm SDSS}$. Since we assume that the
check for overlapping light distributions is done before the
galaxies are selected for shape measurement, galaxies that are
brighter than $r_{\rm SDSS}$ and have faint close neighbours with
magnitudes between $r_{\rm SDSS}$ and $r^{\rm lim}_{\rm SDSS}$ are
removed from the lensing catalogue as well.

Fig. 1. Distribution of galaxy radii in the simulated catalogue
with $r_{\rm SDSS} = 25$. Upper panel: angular radius (no seeing),
lower panel: comoving physical radius $r_{\rm e}$ [$h^{-1}$ kpc].

Table 1. Galaxy number densities for the HLR criterion

Furthermore, we use two criteria to identify pairs of objects whose
projected angular separation $\theta$ is too small for obtaining
reliable shape measurements of the individual galaxies:

- According to the first criterion, two galaxies at $\theta_1$ and
  $\theta_2$ with angular separation $\theta = |\theta_1 - \theta_2|$
  are both removed from the catalogue if
  $\theta < \alpha\,(\theta'_{{\rm e},1} + \theta'_{{\rm e},2})$.
  Here, the effective angular radii are given by

  $$\theta'_{\rm e} = \sqrt{\theta_{\rm e}^2 + \theta_{\rm see}^2}, \qquad (3)$$

  where $\theta_{\rm see}$ is the size of the seeing disk. The
  parameter $\alpha$ can be chosen arbitrarily to tune the strictness
  of the selection criterion and to compensate for inaccuracies of
  our modelling of the galaxy radii. Since this criterion depends on
  the half-light radius of the galaxies, we henceforth denote it with
  "HLR".

- The second criterion (called "FIX" criterion) is similar to what
  is used for, e.g., the CFHTLS (see also Van Waerbeke et al. 2000;
  Maoli et al. 2001). It uses a fixed angular separation threshold:
  if a pair of galaxies fulfils $\theta < \theta_{\rm fix}$, one of
  the two galaxies is selected at random and removed from the
  catalogue. The rationale for doing this is the following: even
  though the light distribution of the remaining galaxy is still
  affected by the light of the removed neighbour, the resulting error
  of the shape estimate should be uncorrelated with any other galaxy
  that remains in the catalogue and should just add to the noise. In
  addition to this, we remove all galaxies that are members of
  obviously severely blended pairs by applying the HLR criterion with
  $\alpha = 1$. We find, however, that the effect of this second step
  on the properties of the resulting galaxy catalogue is generally
  sub-dominant.
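
The two criteria can be sketched as simple pair filters. The following is a minimal brute-force illustration under stated assumptions, not the pipeline used here: seeing is added to the galaxy radius in quadrature, pairs are visited in catalogue order, and in the FIX case a pair is skipped once one of its members has already been removed. The function names are hypothetical.

```python
import math
import random


def _separation(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def hlr_select(positions, theta_e, alpha, theta_see=0.0):
    """HLR criterion: drop BOTH members of every pair that is closer than
    alpha * (theta'_1 + theta'_2). Returns a list of keep flags."""
    # Assumption: seeing and intrinsic radius are combined in quadrature.
    eff = [math.sqrt(t * t + theta_see * theta_see) for t in theta_e]
    keep = [True] * len(positions)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if _separation(positions[i], positions[j]) < alpha * (eff[i] + eff[j]):
                keep[i] = False
                keep[j] = False
    return keep


def fix_select(positions, theta_fix, rng=None):
    """FIX criterion: for every pair closer than theta_fix, drop ONE
    randomly chosen member. Returns a list of keep flags."""
    rng = rng or random.Random()
    keep = [True] * len(positions)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            # Assumption: pairs with an already removed member are skipped.
            if keep[i] and keep[j] and \
                    _separation(positions[i], positions[j]) < theta_fix:
                keep[rng.choice((i, j))] = False
    return keep
```

The additional removal of severely blended pairs described for the FIX criterion would correspond to combining the flags of `fix_select` with those of `hlr_select(..., alpha=1.0)` using a logical AND.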

The choice of the selection criterion and its parameters most 
likely depends on the quality of the data at hand and the shape 
measurement pipeline used. In general, one wishes to retain as 
many galaxies as possible while keeping the bias caused by 
isophote overlap below a certain threshold. 

We remark that a multitude of variants of these selection criteria
can be conceived. For the FIX criterion, for example, one could
remove from a close pair not a random galaxy, but the galaxy with
the lowest signal-to-noise ratio (SNR); a related possibility would
be to always keep close pairs when the SNR of one galaxy is
considerably larger than the SNR of the second. Using these variants
is expected to lead to minor quantitative, but not to qualitative
changes of the results presented in the following sections.

[...] number density compared to the values without removal of close
pairs. These are 11.4, 25.2 and 50.7 arcmin$^{-2}$ for
$r_{\rm SDSS} = 24$, 25 and 26, respectively.

Table 2. Galaxy number densities for the FIX criterion

In Tables 1 and 2 we list the galaxy number densities after
applying the HLR and FIX criteria, respectively, for various values
of $\alpha$, $\theta_{\rm see}$ and $\theta_{\rm fix}$. The values
given in parentheses are the fractional decrease of the number
density compared to the unfiltered galaxy catalogue. As expected,
the deeper the survey and the more restrictive the criterion, the
more galaxies are removed, since the probability of overlap is
proportional to the projected galaxy density and the square of the
threshold radius of the selection criterion.

We compute the shear correlation functions from our simulated
galaxy catalogues using Eq. (1). We obtain the observed galaxy
ellipticities $\epsilon$ using the relation (Schneider & Seitz 1995)

$$\epsilon = \begin{cases} \dfrac{\epsilon^{(s)} + g}{1 + g^{*}\epsilon^{(s)}} & \text{if } |g| \leq 1 \\[2ex] \dfrac{1 + g\,\epsilon^{(s)*}}{\epsilon^{(s)*} + g^{*}} & \text{if } |g| > 1 \end{cases} \qquad (4)$$

where $g$ is the reduced shear obtained from the ray-tracing
simulations and $\epsilon^{(s)}$ is the intrinsic ellipticity. For
measuring the bias caused by the selection criteria described above,
we set $\epsilon^{(s)} = 0$ in Eq. (4). We include intrinsic
ellipticities only when computing the covariance matrix of the shear
correlation functions for the discussion in Sect. 6.
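
The transformation between intrinsic and observed ellipticity can be written as a short function. This sketch assumes the standard two-branch form of the relation, with complex conjugation denoted by `.conjugate()`; the function name is hypothetical.

```python
def observed_ellipticity(eps_s, g):
    """Observed complex ellipticity for intrinsic ellipticity eps_s and
    reduced shear g (both complex numbers)."""
    eps_s = complex(eps_s)
    g = complex(g)
    if abs(g) <= 1.0:
        return (eps_s + g) / (1.0 + g.conjugate() * eps_s)
    return (1.0 + g * eps_s.conjugate()) / (eps_s.conjugate() + g.conjugate())
```

With the intrinsic ellipticity set to zero, as done for the bias measurement, the observed ellipticity reduces to $g$ for $|g| \leq 1$ and to $1/g^{*}$ for $|g| > 1$.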

4. The effect of object selection 

4.1. Effect on the redshift distribution 

Using a selection criterion like the ones described in the previous
section has two undesirable side effects: first, the redshift
distribution of the filtered galaxy catalogue is different from the
one before object selection. This is illustrated in Fig. 2, where we
show in the lower panel the redshift distributions of our three mock
surveys with different magnitude cuts without applying any selection
criterion. The upper panel displays the ratios of the redshift
distributions after and before object selection for the survey with
$r_{\rm SDSS} = 25$. We also consider the case of a limiting
magnitude of the survey that is deeper than the cut used to define
the sample of galaxies used for shape measurements
($r^{\rm lim}_{\rm SDSS} = 26$, whereas $r_{\rm SDSS} = 25$). Seeing
has very little effect on the results obtained with the FIX
criterion, because the size of the seeing disk is typically much
smaller than the fixed threshold radius $\theta_{\rm fix}$; we
therefore only consider the case with $\theta_{\rm see} = 0.65''$.

Fig. 2. Upper panel: Ratio of the redshift distributions for
$r_{\rm SDSS} = 25$ after and before object selection for the HLR
criterion with $\alpha = 2$ (thick solid line:
$\theta_{\rm see} = 0.0''$, thin solid line:
$\theta_{\rm see} = 0.65''$) and the FIX criterion with
$\theta_{\rm fix} = 3.7''$ (thick dashed line). The case with a
deeper limiting magnitude ($r^{\rm lim}_{\rm SDSS} = 26$) than the
magnitude cut for the lensing catalogue is represented by the
double-dotted blue line. Lower panel: Redshift distributions of the
unfiltered galaxy catalogues with limiting magnitudes
$r_{\rm SDSS} = 24$ (solid line), $r_{\rm SDSS} = 25$ (dashed line)
and $r_{\rm SDSS} = 26$ (dotted line).

In all cases, the largest fraction of the galaxies is removed at
low redshifts. These galaxies have the largest apparent radii and
thus have the largest probability of isophote overlap. The amplitude
of the deviations is highest for
$r^{\rm lim}_{\rm SDSS} > r_{\rm SDSS}$, because additional pairs are
removed in which one galaxy is from the magnitude range
$[r_{\rm SDSS}, r^{\rm lim}_{\rm SDSS}]$. For the FIX criterion and
the HLR criterion with seeing, a secondary dip occurs, approximately
at the redshift of the peak of the redshift distribution. This
behaviour is due to the presence of angular clustering.

We illustrate this by constructing a simple analytical model, for
which we subdivide the galaxies into redshift slices of width
${\rm d}z$. We assume that there are no angular cross-correlations
between slices at different redshifts. Furthermore, we use a
power-law model for the angular correlation function, so that it is
given by $\omega(\theta; z, z') = A(z)\,\theta^{-\gamma}$ if
$z = z'$ and $\omega(\theta; z, z') = 0$ otherwise. The galaxy radii
are given in a deterministic fashion by
$\theta_{\rm e}(z) = \sqrt{r_{\rm e}^2(z) / f_K^2[w(z)] + \theta_{\rm see}^2}$,
where $r_{\rm e}(z)$ can be considered to be the mean radius of all
galaxies in a thin redshift bin centred on $z$. The probability of
finding a galaxy with redshift $z'$ within an annulus of radius
$\theta$ and width ${\rm d}\theta$ around a galaxy at redshift $z$ is
thus
${\rm d}p(z, z') = 2\pi\theta\,{\rm d}\theta\,[1 + \omega(\theta; z, z')]\,N(z')/\Omega$,
where $\Omega$ is the total area of the survey, and $N(z')$ is the
number of galaxies in the redshift slice centred on $z'$. Two
galaxies have overlapping isophotes if they are closer than
$\theta_{\rm eff}(z, z') = \theta_{\rm e}(z) + \theta_{\rm e}(z')$
(corresponding to the HLR criterion with $\alpha = 1$, which we
choose here for simplicity). The total probability of overlap for
two galaxies is given by the integral of ${\rm d}p(z, z')$ over a
circle with radius $\theta_{\rm eff}(z, z')$. Finally, we obtain the
total number of galaxies removed from the slice at $z$ by summing up
the contributions from all redshift slices:



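$$\Delta N(z) = \frac{2\pi N(z)}{\Omega} \int {\rm d}z'\, N(z') \int_0^{\theta_{\rm eff}(z,z')} {\rm d}\theta\;\theta\,\left[1 + \omega(\theta; z, z')\right]. \qquad (5)$$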

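Simplifying and inserting our model for the correlation function,
we obtain

$$\frac{\Delta N(z)}{N(z)} = \frac{\pi}{\Omega} \left[\int {\rm d}z'\,\theta_{\rm eff}^2(z,z')\,N(z') + \frac{2A(z)\,N(z)}{2-\gamma}\,\theta_{\rm eff}^{2-\gamma}(z,z)\right], \qquad (6)$$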



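where the second term accounts for the effect of galaxy clustering.
We can simplify this even more by using a constant clustering
amplitude $A$ and a redshift-independent radius $\vartheta$ for all
galaxies, so that $\theta_{\rm eff}(z, z') = 2\vartheta$. We then
find that

$$\frac{\Delta N(z)}{N(z)} = \frac{\pi}{\Omega} \left[N_{\rm tot}\,(2\vartheta)^2 + \frac{2A\,N(z)}{2-\gamma}\,(2\vartheta)^{2-\gamma}\right], \qquad (7)$$

where $N_{\rm tot}$ is the total number of galaxies summed over all
redshift slices.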



We see that in the absence of angular correlations, a constant
fraction of objects is removed from the total galaxy population,
given only by the fraction of the total area covered by galaxies.
This leaves the shape of the redshift distribution unchanged. Galaxy
clustering increases the probability of overlapping isophotes.
However, this is effective only for galaxies in the same redshift
slice. The clustering contribution to the fraction of blended
objects in a slice is proportional to $N(z)$ (and not to
$N_{\rm tot}$, as without clustering). This causes the secondary
minimum seen in the upper panel of Fig. 2 for those selection
criteria where $\theta_{\rm e}(z)$ approaches a finite, constant
value as $z$ increases. This is the case if seeing is present, as
well as for the FIX criterion. If, on the other hand,
$\theta_{\rm e}(z)$ is allowed to fall to zero, as for the HLR
criterion without seeing, this suppresses the clustering term in
Eq. (6), because the galaxy radii (and thus $\theta_{\rm eff}$) are
already close to zero when $N(z)$ approaches its maximum.
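
The simplified model with constant radius and clustering amplitude can be evaluated numerically. The sketch below assumes the form of Eq. (7) in which the removed fraction of a slice is $(\pi/\Omega)\,[N_{\rm tot}(2\vartheta)^2 + 2A\,N(z)\,(2\vartheta)^{2-\gamma}/(2-\gamma)]$; the function name and the example numbers are purely illustrative.

```python
import math


def removed_fraction(n_slices, theta, area, amplitude, gamma):
    """Fraction Delta N(z) / N(z) removed from each redshift slice.

    n_slices  -- number of galaxies N(z) in each slice
    theta     -- redshift-independent galaxy radius (angular units)
    area      -- survey area Omega (same angular units, squared)
    amplitude -- clustering amplitude A of omega = A * theta**(-gamma)
    gamma     -- slope of the angular correlation function (< 2)
    """
    n_tot = sum(n_slices)
    d = 2.0 * theta  # overlap threshold theta_eff for two equal radii
    uniform = n_tot * d ** 2
    return [math.pi / area
            * (uniform + 2.0 * amplitude * n / (2.0 - gamma) * d ** (2.0 - gamma))
            for n in n_slices]


# Illustrative numbers only: three slices, the middle one most populated.
slices = [100.0, 400.0, 200.0]
no_clustering = removed_fraction(slices, 0.001, 1.0, 0.0, 0.8)
with_clustering = removed_fraction(slices, 0.001, 1.0, 0.01, 0.8)
```

Without clustering every slice loses the same fraction, so the shape of the redshift distribution is preserved. With clustering the loss is largest in the most populated slice, which is the origin of the secondary dip near the peak of the redshift distribution.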

The change of the redshift distribution is relevant if the red- 
shifts of the individual galaxies are not available. In such cases, 
p(z) is usually inferred from a sub-sample or a similar survey 
with either spectroscopic or photometric redshifts. In general, 
the objects used for these calibration samples are selected in 
a different way than the galaxies for the shear catalogue, and 
therefore the redshift distribution obtained in this way does not 
account for the change of p(z) due to object selection. For up- 
coming lensing surveys incorporating photometric redshifts, this 
should be less of a concern, since in this case the redshift dis- 
tribution of the galaxies in the shear catalogue can at least in 
principle be estimated directly. 

4.2. Density-dependent galaxy selection 

The second, and probably more severe effect of using selec- 
tion criteria such as HLR and FIX arises because the selec- 
tion is density-dependent. Since the galaxy distribution is cor- 
related with the underlying density field, a mass overdensity in 
the foreground also implies an overdensity of galaxies. This in 
turn implies a higher probability of isophote overlap and thus of 
the removal of galaxies. Therefore, the fraction of galaxy pairs 
that can be formed from galaxies located behind overdensities 
is decreased relative to all galaxy pairs that contribute to the 
shear correlation function estimator for a certain angular
separation bin. High-density regions are therefore effectively
downweighted compared to the case without object selection. The
opposite is true for underdense regions: the probability that a
galaxy behind the underdensity is filtered is reduced, and more
pairs than for a region of average density contribute to the shear
correlation functions. This re-weighting is further modified by the
fact that the fraction of galaxies that is removed from the fore-
and background of a specific lens is not constant (see Fig. 2). The
ratio of the number of foreground-foreground and
foreground-background pairs, which do not carry information 
about the lens, to the number of background-background pairs, 
where both galaxies have been sheared by the lens, depends on 
the lens redshift. If relatively more pairs in the background than 
in the foreground are removed, the signal of the lens is further 
suppressed (and vice versa). 

The net effect of all this is that the shear correlation estimator
given by Eq. (1) is no longer unbiased. The reason for this is that
the weights $p_i$ are no longer uncorrelated with the galaxy
ellipticities, because the selection weights $\varpi_i$ depend on
the projected galaxy density through the mechanism described above.

4.3. Simulation results 

We define the bias due to object selection as

$$\Delta\xi_\pm^{z+\delta}(\theta) = \xi_\pm^{\rm sel}(\theta) - \xi_\pm(\theta), \qquad (8)$$

where $\xi_\pm^{\rm sel}$ are the shear correlation functions after
filtering for close pairs, and $\xi_\pm$ are the correlation
functions computed from all galaxies in the field of view. The
superscript "$z+\delta$" indicates that $\Delta\xi_\pm^{z+\delta}$
includes the bias both due to the change of the redshift
distribution and the density-dependent selection of galaxies.

In the upper panels of Figs. 3 and 4 we show the fractional bias

$$\frac{\Delta\xi_\pm^{z+\delta}(\theta)}{\xi_\pm(\theta)} \qquad (9)$$

for the HLR and FIX criterion, respectively. The error bars have
been computed from the field-to-field variation between the 32
ray-tracing realisations. For both criteria, $\xi_\pm$ is biased
high by several percent on large scales, whereas on small scales
this bias can become negative. The behaviour for large $\theta$ can
be explained by the change of the redshift distribution, which gives
more weight to high-redshift galaxies, which carry the strongest
shear signal (see Fig. 2). On small scales, the negative bias begins
to dominate due to the density-dependence of the way galaxy pairs
are selected (see below). As discussed before, seeing is only
relevant for the HLR criterion and was therefore not considered in
Fig. 4.

Fig. 3. Fractional bias of the shear correlation functions for the
HLR criterion, without (upper panels) and with (lower panels)
correction for the change of the redshift distribution. Thick dashed
lines are for $\alpha = 1$, thick solid lines for $\alpha = 3$
without seeing. For the respective thin lines a seeing of $0.65''$
was assumed. The shaded region shows the $1\sigma$ error. For better
visibility, it is shown only for the case of $\alpha = 3$,
$\theta_{\rm see} = 0''$. The error bars for the other cases are
very similar.

Fig. 4. Same as Fig. 3, but for the FIX criterion with
$\theta_{\rm fix} = 2''$ [...] $\theta_{\rm fix} = 5.0''$
(dot-dashed blue curves).

If photometric redshifts are available for all galaxies, the
correct redshift distribution after object selection can be
estimated, and one is only interested in the systematic effect
caused by the density-dependence of the galaxy selection. To
quantify this, we therefore would like to compare
$\xi_\pm^{\rm sel}$ to fiducial correlation functions that were
computed using the correct redshift distribution and with galaxy
pairs selected in a fair way, i.e. uncorrelated with the density
field. To this end, we take the unfiltered galaxy catalogues (the
ones that led to $\xi_\pm$), sort the galaxies into redshift bins,
and randomly remove galaxies from each bin so that the resulting new
galaxy catalogue has the same redshift distribution as the catalogue
after applying one of the selection criteria. We denote the
correlation functions computed from the new catalogues by
$\xi_\pm^{\rm corr}$, and define the bias only due to the
density-dependence of the galaxy selection process (indicated by the
superscript "$\delta$") by

$$\Delta\xi_\pm^{\delta}(\theta) = \xi_\pm^{\rm sel}(\theta) - \xi_\pm^{\rm corr}(\theta). \qquad (10)$$

Accordingly, the corresponding fractional bias is given by

$$\frac{\Delta\xi_\pm^{\delta}(\theta)}{\xi_\pm^{\rm corr}(\theta)}. \qquad (11)$$
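
The construction of the redshift-matched comparison catalogue can be sketched as follows. This is a minimal illustration under the assumption that the filtered catalogue is a subset of the unfiltered one, so that every bin of the unfiltered catalogue holds at least as many galaxies as the target; the function names and the binning are hypothetical.

```python
import random


def match_redshift_distribution(z_all, z_sel, bin_edges, rng=None):
    """Randomly subsample the unfiltered catalogue so that its redshift
    histogram equals that of the filtered catalogue.

    z_all     -- redshifts of the unfiltered catalogue
    z_sel     -- redshifts of the catalogue after object selection
    bin_edges -- increasing redshift bin boundaries
    Returns the sorted indices (into z_all) of the retained galaxies.
    """
    rng = rng or random.Random()
    nbin = len(bin_edges) - 1

    def bin_index(z):
        for k in range(nbin):
            if bin_edges[k] <= z < bin_edges[k + 1]:
                return k
        return None

    target = [0] * nbin                  # histogram of the filtered sample
    for z in z_sel:
        k = bin_index(z)
        if k is not None:
            target[k] += 1
    members = [[] for _ in range(nbin)]  # unfiltered galaxies per bin
    for idx, z in enumerate(z_all):
        k = bin_index(z)
        if k is not None:
            members[k].append(idx)
    chosen = []
    for k in range(nbin):
        # Random draw without replacement, blind to the local density.
        chosen.extend(rng.sample(members[k], target[k]))
    return sorted(chosen)


def fractional_bias(xi_sel, xi_ref):
    """Fractional bias (xi_sel - xi_ref) / xi_ref for one angular bin."""
    return (xi_sel - xi_ref) / xi_ref
```

Because the galaxies are drawn at random within each redshift bin, the subsample has the filtered redshift distribution but no dependence on the local galaxy density, which isolates the density-dependent part of the bias.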



We show the simulation results for this fractional bias in the
lower panels of Figs. 3 and 4. The bias is now consistent with being
negative for all angular separations, as expected from the
qualitative picture described in Sect. 4.2. The effect is most
severe for small $\theta$, whereas $\Delta\xi_\pm^{\delta}$ seems to
asymptotically approach zero on large scales. Even after correcting
for the change of the redshift distribution, the bias is of the
order of several percent and therefore constitutes a potentially
significant contaminant for present and future cosmic shear surveys.

We compare a survey with a limiting magnitude of
$r^{\rm lim}_{\rm SDSS} = 26$ and a cut for the lensing catalogue of
$r_{\rm SDSS} = 25$ to a survey with
$r^{\rm lim}_{\rm SDSS} = r_{\rm SDSS} = 25$ in Fig. 5, using the
HLR criterion with $\alpha = 2$. While $\Delta\xi_\pm^{z+\delta}$
increases by $\approx 1\%$ in the case with
$r^{\rm lim}_{\rm SDSS} = 26$ due to the change in redshift
distribution (see also Fig. 5), no significant differences between
the two surveys can be found if the correct redshift distribution is
known. The reason for this is that although more galaxies are
removed when the deeper limiting magnitude is used, the bias is
primarily due to the change of the relative weights of over- and
underdense regions in the correlation function estimator. These
weights only depend on the relative change of the number of galaxy
pairs behind such structures. The same argument can be used to
explain the results displayed in Fig. 6, where we investigate the
behaviour of the bias (taking the change of $p(z)$ into account) for
various magnitude cuts (using the HLR criterion). We find that, for
the magnitude cuts and the resulting redshift distributions of
galaxies considered here, the fractional bias
$\Delta\xi_\pm^{\delta}$ depends only very little on the survey
depth. The only notable difference occurs on small scales, where the
bias for deeper surveys is slightly less severe than for shallower
ones.

Fig. 5. Comparison of the fractional bias of $\xi_\pm$ for a survey
with limiting magnitude $r^{\rm lim}_{\rm SDSS} = 26$ and a
magnitude cut for the galaxies that are used for shape measurements
of $r_{\rm SDSS} = 25$ (solid lines), to the bias for a survey where
$r^{\rm lim}_{\rm SDSS} = r_{\rm SDSS} = 25$ (dashed lines). For
both cases, the HLR criterion with $\alpha = 2$ was used. Thick
curves display the case without seeing, thin curves the case with
$\theta_{\rm see} = 0.65''$.



5. Weighting scheme 

The bias discussed in the previous sections is caused by the re- 
moval of galaxy pairs in a way that is correlated with the den- 
sity field. This suggests that the bias could be reduced by in- 
creasing or decreasing the relative pair count behind over- and 
underdensities, respectively, to "fair" levels. Such a procedure 
would reduce both the bias due to the change of p(z) and due to 
the density-dependent selection. We propose that, if photo-z es- 
timates are available also for the galaxies that have been filtered 
out, this can be achieved by identifying the nearest neighbour 
of a removed object on the sky, and doubling its weight for the 
correlation function computation (i.e. using it twice in the shear 
catalogue). We demand close proximity on the sky and in red- 
shift to ensure that the shear of the neighbour is a reasonable 
proxy for the shear at the position of the filtered galaxy. 
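
The proposed re-weighting can be sketched as a nearest-neighbour search. This is a hedged illustration, not the implementation used for the results in this paper: it assumes that each removed galaxy adds one unit of weight to its nearest retained neighbour (so a neighbour shared by two removed galaxies is used three times), and that "close in redshift" means lying in a slice of full width `dz` centred on the removed galaxy. The function name is hypothetical.

```python
import math


def neighbour_weights(positions, redshifts, keep, dz):
    """Pair-count weights after the nearest-neighbour correction.

    positions -- list of (x, y) sky coordinates
    redshifts -- (photometric) redshift of every galaxy
    keep      -- keep flags from the close-pair filter
    dz        -- full width of the redshift slice around a removed galaxy
    Returns one weight per galaxy (0 for removed galaxies).
    """
    weights = [1.0 if k else 0.0 for k in keep]
    for i, kept in enumerate(keep):
        if kept:
            continue
        best, best_dist = None, math.inf
        for j, other_kept in enumerate(keep):
            # Only retained galaxies inside the redshift slice qualify.
            if not other_kept or abs(redshifts[j] - redshifts[i]) > 0.5 * dz:
                continue
            dist = math.hypot(positions[j][0] - positions[i][0],
                              positions[j][1] - positions[i][1])
            if dist < best_dist:
                best, best_dist = j, dist
        if best is not None:
            weights[best] += 1.0  # the neighbour stands in for galaxy i
    return weights
```

The returned weights can be used as the $p_i$ in the correlation function estimator; a removed galaxy without any retained neighbour in its slice is simply left uncompensated in this sketch.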

Since the geometrical lensing weight functions change rela- 
tively slowly with comoving distance, it is sufficient to choose 





- 




-2 - 


oT 


-4 - 


«-i -+- 


-6 ' 








-8 i 




- 




-4 - 




-8 - 


•o 1 


-12 - 




-16 - 


< 




HLR, a = 2, 9 see = 0.65" 




1 10 
6 farcminl 

Fig. 6. Fractional bias of the shear correlation functions for 
various survey depths using the HLR criterion with a — 2, 
#see = (X'65, corrected for the change of the redshift distribu- 
tion. Solid black lines with error bars: tsdss = 24; blue dashed 
line: rsDSS = 25; red dotted line: tsdss = 26 
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Fig. 7. Upper panel: distribution of the separation of rejected 
galaxies from their nearest accepted neighbour in a slice of thick- 
ness Aw; a magnitude cut of rsDSS = 25 and the HLR criterion 
with a = 2, ftjee = 0765 were used. Lower panel: same as upper 
panel, but for slices with thickness given by Az p hot- Vertical lines 
indicate mean separations. 



an object from a redshift slice centred on the removed object
with a certain width (a few hundred Mpc). This also helps to find
a neighbour that is sufficiently close to the removed galaxy
on the sky; clearly, the larger the slice width, the smaller
the projected nearest-neighbour distance of objects in the slice.
This is illustrated in Fig. 7, where we show the distribution of
the distance from a rejected galaxy to its nearest neighbour for
slices with widths specified in comoving distance (upper panel)
or photometric redshift (lower panel). The latter were obtained
by simulating the photo-z accuracy in a typical contemporary
weak lensing survey. We use the recipe described in Hildebrandt
et al. (2007, 2009) to simulate a multi-colour catalogue based
on realistic distributions of redshift, spectral type, magnitude
and magnitude error, closely resembling the CFHTLS-Wide.

Fig. 8. Fractional bias of $\xi_\pm$ using the weighting scheme
described in Sect. 5. Thin curves: fractional bias after applying the
weighting scheme for slices with thickness $\Delta z_{\rm phot}$; thick solid
green curve: $\Delta\xi^{\rm s}_\pm$ without weighting scheme; thick dot-dashed
green curve: $\Delta\xi_\pm$ without weighting scheme. Error bars were
computed from field-to-field variation and, for better visibility,
are shown only for one case.



Finally, photo-z's are estimated with the BPZ code (Benítez
2000). Comparisons of the simulated photo-z accuracy to that
obtained from real CFHTLS data (Erben et al. 2009) show good
agreement. For the thickest slices considered ($\Delta z_{\rm phot} = 0.2$), the
mean distance to the nearest neighbour is 44", and decreases to
22" for the slice with $\Delta z_{\rm phot} = 0.05$.
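The substitution step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the results in this paper: it assumes flat-sky Cartesian positions, a brute-force neighbour search, and that a galaxy chosen as substitute for several removed objects gains one unit of weight per removed object. The function name `neighbour_weights` and its arguments are hypothetical.

```python
import numpy as np


def neighbour_weights(acc_xy, acc_z, rej_xy, rej_z, dz_slice):
    """Integer weights for accepted galaxies after nearest-neighbour substitution.

    acc_xy, rej_xy : (N, 2) and (M, 2) flat-sky positions in the same angular unit
    acc_z, rej_z   : photometric redshifts of accepted and rejected galaxies
    dz_slice       : full thickness of the redshift slice centred on each
                     rejected galaxy
    """
    acc_xy = np.asarray(acc_xy, dtype=float)
    acc_z = np.asarray(acc_z, dtype=float)
    rej_xy = np.asarray(rej_xy, dtype=float).reshape(-1, 2)
    rej_z = np.asarray(rej_z, dtype=float)

    weights = np.ones(len(acc_z), dtype=int)
    for xy, z in zip(rej_xy, rej_z):
        # Accepted galaxies inside the slice centred on the rejected one.
        idx = np.flatnonzero(np.abs(acc_z - z) <= 0.5 * dz_slice)
        if idx.size == 0:
            continue  # assumption: without a substitute the galaxy stays lost
        d2 = np.sum((acc_xy[idx] - xy) ** 2, axis=1)
        # Count the nearest neighbour once more in the shear catalogue.
        weights[idx[np.argmin(d2)]] += 1
    return weights


acc_xy = [[0.0, 0.0], [1.0, 0.0], [0.1, 0.0]]
acc_z = [0.50, 0.55, 1.50]
weights = neighbour_weights(acc_xy, acc_z, rej_xy=[[0.2, 0.0]], rej_z=[0.52],
                            dz_slice=0.2)
print(weights)  # [2 1 1]
```

In this toy catalogue the accepted galaxy closest on the sky lies far outside the redshift slice and is ignored; the extra weight goes to the nearest galaxy inside the slice.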

In Fig. 8 we compare the fractional bias of the correlation
functions measured after applying the weighting scheme,
$\Delta\xi^{\rm w}_\pm(\theta)$ (Eqs. 12 and 13, the latter defining $\beta^{\rm w}_\pm$),
to the original $\Delta\xi_\pm$ and $\Delta\xi^{\rm s}_\pm$. We only consider slices defined
in terms of photometric redshift; the corresponding results for 
slices with a given comoving thickness are very similar. The sug- 
gested procedure clearly reduces the bias, in particular on large 
scales. Its performance degrades at angular separations
below $\approx 1'$. The reason for this is that the selection of the nearest
neighbour effectively corresponds to a smoothing of the shear
field with a smoothing length comparable to the mean nearest-neighbour
distance, because it is assumed that the shear of the
removed galaxy is the same as that of its substitute. Since $\xi_-$
is more sensitive to small-scale power than $\xi_+$, it should be
particularly affected, as can indeed be seen in the lower panel of
Fig. 8. The deviations seen for large $\theta$ are consistent with noise.
On the other hand, the actual width of the redshift slice and, 
related to this, the actual value of the mean nearest-neighbour 
distance have very little effect on the quality of the method, al- 
though there is a slight tendency for thicker slices to yield values 
of $\Delta\xi^{\rm w}_\pm$ that are larger by a few fractions of a percent. This means
that the weighting scheme is relatively insensitive to the quality 
of the photometric redshifts. This is particularly important for 
those galaxies which have been filtered out because their light 
distribution is contaminated by a close neighbour, which also 
adversely affects the accuracy of their photo-z estimates. 
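For illustration, the weights can enter a standard pair-based estimator of $\xi_\pm$ as the product $w_i w_j$. For integer weights this is equivalent to duplicating catalogue entries, since the zero-separation pair formed by a galaxy and its own copy falls outside the angular bins. The brute-force sketch below assumes flat-sky positions and no additional shape-measurement weights; `weighted_xi` and its arguments are hypothetical names, and the estimator used for the measurements in this paper may differ in detail.

```python
import numpy as np


def weighted_xi(xy, e1, e2, weights, theta_edges):
    """Brute-force estimate of xi_+ and xi_- with per-galaxy weights."""
    xy = np.asarray(xy, dtype=float)
    eps = np.asarray(e1, dtype=float) + 1j * np.asarray(e2, dtype=float)
    w = np.asarray(weights, dtype=float)

    i, j = np.triu_indices(len(w), k=1)        # every pair exactly once
    sep = xy[j] - xy[i]
    theta = np.hypot(sep[:, 0], sep[:, 1])
    phi = np.arctan2(sep[:, 1], sep[:, 0])     # polar angle of the separation

    ww = w[i] * w[j]
    # xi_+ = <e_t e_t + e_x e_x>,  xi_- = <e_t e_t - e_x e_x>
    plus = ww * np.real(eps[i] * np.conj(eps[j]))
    minus = ww * np.real(eps[i] * eps[j] * np.exp(-4j * phi))

    n_bins = len(theta_edges) - 1
    b = np.digitize(theta, theta_edges) - 1
    ok = (b >= 0) & (b < n_bins)               # drop pairs outside the bins
    norm = np.bincount(b[ok], weights=ww[ok], minlength=n_bins)
    xi_p = np.bincount(b[ok], weights=plus[ok], minlength=n_bins)
    xi_m = np.bincount(b[ok], weights=minus[ok], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return xi_p / norm, xi_m / norm        # empty bins yield nan


xy = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
e1 = [0.1, 0.1, -0.05]
e2 = [0.0, 0.02, 0.03]
edges = [0.5, 1.5, 3.0]
xi_p, xi_m = weighted_xi(xy, e1, e2, [2, 1, 1], edges)
print(xi_p, xi_m)
```

Calling the function once with unit weights and once with the weights from the substitution step gives the two measurements whose difference indicates how strongly a given catalogue is affected.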



6. Implications for cosmological parameters 

To illustrate the importance of the object selection bias for parameter
estimation, we perform a likelihood analysis to fit for
the parameters $\pi = (\Omega_{\rm m}, \sigma_8, w_0)$ (assuming a flat universe).
Our fiducial cosmological model, denoted by $\pi_0$, is that of the
Millennium Simulation (see Sect. 3). We use the galaxy sample
with $r_{\rm SDSS} = 25$ from our ray-tracing simulations to simulate
a survey with an area of 1500 deg$^2$. To each galaxy, we assign
Gaussian ellipticity noise with dispersion $\sigma_\epsilon = 0.4$. We consider
correlation functions given in ten logarithmically spaced bins in
the range from $1'$ to $80'$, and assume a Gaussian likelihood of
the form

$$
p(\xi \,|\, \pi) \propto \exp\left\{ -\frac{1}{2} \left[\xi - m(\pi)\right]^{\rm t} C^{-1} \left[\xi - m(\pi)\right] \right\} . \tag{14}
$$

Here, $\xi = [\xi_+(\theta_1), \ldots, \xi_+(\theta_n), \xi_-(\theta_1), \ldots, \xi_-(\theta_n)]^{\rm t}$ is the measured
correlation function (see below), written in vectorial form,
and $m(\pi)$ is the model prediction based on the three-dimensional
matter power spectrum as given by Smith et al. (2003). The covariance
matrix $C$ has been estimated from the field-to-field variation
of the ray-tracing realisations. When computing its inverse,
we correct for the bias caused by the noise in the estimate of $C$
by using the correction factor described in Hartlap et al. (2007).
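A minimal numerical sketch of Eq. (14) is given below. It assumes that the covariance is estimated from $N$ independent realisations of a data vector with $p$ entries and that the inverse is rescaled by $(N - p - 2)/(N - 1)$, the debiasing factor of Hartlap et al. (2007). The function names are hypothetical, and the random numbers in the usage lines are placeholders, not values from the ray-tracing simulations.

```python
import numpy as np


def debiased_inverse_covariance(realisations):
    """Estimate C from realisations (one per row) and return a debiased inverse."""
    realisations = np.asarray(realisations, dtype=float)
    n_real, n_data = realisations.shape
    if n_real <= n_data + 2:
        raise ValueError("need more than n_data + 2 realisations")
    # Sample covariance, normalised by n_real - 1.
    cov = np.atleast_2d(np.cov(realisations, rowvar=False))
    # Rescaling that removes the bias of the inverted noisy covariance.
    factor = (n_real - n_data - 2) / (n_real - 1)
    return factor * np.linalg.inv(cov)


def log_likelihood(data, model, inv_cov):
    """Gaussian log-likelihood of Eq. (14), up to an additive constant."""
    r = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    return -0.5 * r @ inv_cov @ r


rng = np.random.default_rng(1)
fields = rng.normal(size=(64, 20))   # toy: 64 fields, 20-entry data vector
inv_cov = debiased_inverse_covariance(fields)
print(log_likelihood(fields[0], np.zeros(20), inv_cov))
```

Evaluating `log_likelihood` on a grid of model vectors and locating the maximum gives the kind of maximum-likelihood estimate discussed in the remainder of this section.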

Since we are only interested in the effect of removing
close galaxy pairs and not in the bias caused by the mismatch
between the theoretical model and the correlation functions in the
Millennium Simulation (see Hilbert et al. 2009), we construct
the data vectors $\xi$ from the model for our fiducial set of parameters,
$m(\pi_0)$, and the bias $\beta$ due to object selection, measured
from the ray-tracing simulations (Eq. 15).


For $\beta$, we consider three cases: (a) assuming no knowledge of
the true redshift distribution after removing blended galaxies
(i.e. using $\beta_\pm$ from Eq. 8), (b) assuming that the change of the
redshift distribution has been taken into account (using $\beta^{\rm s}_\pm$ from
Eq. 10), and (c) assuming that the weighting scheme described
in the previous section has been applied (using $\beta^{\rm w}_\pm$ from Eq. 13).

In Fig. 9 we show the results of this procedure for both the
HLR criterion with $\alpha = 2$ and $\theta_{\rm see} = 0.65''$, and the FIX criterion
with $\theta_{\rm fix} = 3.7''$. In Table 3 we show the fractional shifts of each
cosmological parameter with respect to its fiducial value for the
various cases. As expected from the larger amplitude of the bias
in the correlation functions for the HLR criterion (see Figs. 3 and
4), the deviation of the maximum-likelihood points from the true
values is generally larger for the HLR criterion than for the FIX
criterion. In both cases, the parameter estimates are off by several
percent. Interestingly, knowledge of the correct redshift distribution
does not necessarily improve the parameter estimates,
which is particularly striking in the $\Omega_{\rm m}$-$\sigma_8$ plane. The bias of the
maximum-likelihood estimates of all parameters is reduced by a
significant amount if the proposed weighting scheme is applied,
as could be expected from the corresponding reduction of the bias in
the shear correlation functions (see Fig. 8). While not being a
perfect solution that would be accurate enough to be applied to 
planned large-area surveys, the method works sufficiently well to 
reduce the bias to a level that is well below the statistical errors 
for surveys like the CFHTLS (at least for the non-tomographic 
case considered here). Furthermore, a comparison of the correla- 
tion functions measured with and without the weighting scheme 
may be used to assess the importance of the object selection bias 
for a given survey.

Fig. 9. Likelihood analysis of the bias caused by the removal of
blended galaxies. Each panel shows the 1- and 2$\sigma$ confidence
contours obtained by marginalizing over the remaining parameter
(computed for the HLR criterion with $\alpha = 2$ and $\theta_{\rm see} = 0.65''$).
The fiducial parameter values are marked with crosses. Circles
indicate the maximum-likelihood estimates assuming that the true
redshift distribution after object selection is unknown, squares show
the maxima if the correct $p(z)$ is used, and triangles give the
estimates after applying the weighting scheme of Sect. 5. Filled
symbols are used for the HLR criterion ($\alpha = 2$; $\theta_{\rm see} = 0.65''$),
open symbols for the FIX criterion with $\theta_{\rm fix} = 3.7''$. For better
visibility, the estimates for each criterion type have been connected
with a line.



Table 3. Fractional bias of cosmological parameters

7. Summary and conclusions

We have described a new, so far unconsidered systematic ef- 
fect affecting the measurement of the shear correlation functions. 
The cause of the bias is the common practice of removing galax- 
ies from the lensing catalogue that have very close neighbours, 
in order to avoid isophote overlap. While this filtering is nec- 
essary for obtaining clean shape estimates, it has two adverse 
effects on the correlation function estimate. The first consists in 
altering the redshift distribution of the galaxy catalogue. This 
is most important for low redshifts (where the angular sizes of 
galaxies are large) and (as a result of the angular clustering of the 
galaxies) near the peak of the redshift distribution. Second, such 
filtering predominantly removes galaxies that lie behind over- 
dense regions. Therefore, fewer pairs of galaxies can be formed 
that carry the shear signal of the overdensity, effectively down- 
weighting it in the correlation function estimator. For similar rea- 
sons, underdense patches of the sky receive a higher weight. As 
a result of this, the estimate of the shear correlation functions 
obtained from a galaxy catalogue, from which close pairs have 
been removed, is biased. 
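The down-weighting argument can be made concrete with a deliberately simple toy calculation. Everything in it is an assumption made for illustration only: the sky is split into patches, the shear signal behind a patch grows linearly with the foreground density contrast, the fraction of rejected sources grows with the same contrast, and the number of pairs available in a patch scales as the square of the number of sources. None of the numbers are taken from the simulations.

```python
import numpy as np

# Toy sky patches: foreground density contrast, symmetric about zero.
delta = np.linspace(-0.5, 0.5, 11)

# Assumption: the shear signal behind a patch grows with foreground density.
signal = 1.0 + 0.8 * delta

# Assumption: the fraction of rejected sources grows with foreground density.
n_before = np.full_like(delta, 100.0)
removed_fraction = 0.2 * (1.0 + delta)
n_after = n_before * (1.0 - removed_fraction)


def pair_weighted_mean(signal, n_sources):
    """Average the signal with weights proportional to the pair count ~ n^2."""
    pairs = n_sources ** 2
    return np.sum(pairs * signal) / np.sum(pairs)


fair = pair_weighted_mean(signal, n_before)       # uniform source density
filtered = pair_weighted_mean(signal, n_after)    # density-dependent rejection
print(f"fractional bias: {filtered / fair - 1.0:+.3f}")
```

Because the patches with the strongest signal lose the most pairs, the filtered average falls below the fair one, which mirrors the sign of the bias described above.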



In order to quantify this bias, we have run ray-tracing simu- 
lations through the Millennium Simulation in conjunction with 
a semi-analytic model of galaxy formation and observed scaling 
relations for the radii of the galaxies. We consider two different
selection criteria: one removes pairs of galaxies separated by less
than a threshold that depends on the radii of the galaxies, and the
other removes one galaxy of each pair separated by less than a
fixed threshold. We find that the change of
the redshift distribution due to filtering is of the order of several 
percent; however, this can in principle be dealt with if photomet- 
ric redshifts are available for all galaxies. 

The effect of the density-dependence of the galaxy selection 
varies with angular separation. We find that on scales of $\approx 1'$,
the shear correlation functions are biased low by typically sev- 
eral percent; the bias decreases for larger angular separations. 
The bias seems to be almost independent of the survey depth. 
While seeing has essentially no effect on the bias when a fixed 
threshold radius is used to define close pairs, adding seeing to the 
simulations can significantly increase the bias for the selection 
criterion depending on the sizes of galaxies. 

We note that the bias studied here is different from the effects
of the clustering of source galaxies previously discussed in
the literature (Bernardeau 1998; Schneider et al. 2002b): the removal
of close galaxy pairs creates an anti-correlation between
foreground and background galaxies, and thus between the lensing
matter distribution and the galaxies that are used to trace the
shear field caused by the matter in the foreground. This induces
clustering between galaxy populations that are widely separated
in redshift, whereas the effect of Schneider et al. (2002b) arises
from the clustering of source galaxies that are at very similar
redshifts and which need not be related to the dark matter distribution
at all.

We have investigated the impact of the new systematic effect
on estimates of $\Omega_{\rm m}$, $\sigma_8$ and $w_0$, assuming a flat universe and
keeping all other parameters fixed. Irrespective of whether the 
correct redshift distribution is used or not, we find shifts of the 
maximum-likelihood estimators of several percent. The situation 
can be significantly improved by using different weights for the 
galaxies that are eventually used for measuring the correlation 
functions. The weighting scheme consists of double-counting 
the nearest neighbour (from within a redshift slice with thick- 
ness of a few hundred Mpc) of a galaxy that has been removed. 
This requires photometric redshift estimates to be available also 
for the galaxies that have been removed by the selection crite- 
rion. We find that the method works well even for slices as thick 
as $\Delta z_{\rm phot} = 0.2$, so that the requirements on the quality of these
redshift estimates are relatively low. The weighting scheme re- 
stores the pair count to fair levels and substitutes the shear of 
the filtered galaxy with the shear of the nearest neighbour. The 
scheme is surprisingly independent of the actual width of the 
redshift slice and reduces the bias of the correlation function to 
levels of < 1% for angular scales ranging from $\approx 2'$ to $\approx 80'$.
Accordingly, the bias of cosmological parameter estimates is
also significantly reduced.

Given the amplitude of the bias of the shear correlation function,
this new systematic effect has the potential of being very
significant. The weighting scheme we propose is a first step towards
controlling it, but it probably lacks the accuracy necessary
for the next generation of weak lensing experiments.